On the Comparison Complexity of the String Prefix-Matching Problem
Authors
Abstract
In this paper we study the exact comparison complexity of the string prefix-matching problem in the deterministic sequential comparison model with equality tests. We derive almost tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line prefix-matching algorithms for any fixed pattern and variable text. Unlike previous results on the comparison complexity of string-matching and prefix-matching algorithms, our bounds are almost tight for any particular pattern. We also consider the special case where the pattern and the text are the same string. This problem, which we call the string self-prefix problem, is similar to the pattern preprocessing step of the Knuth-Morris-Pratt string-matching algorithm, which is used in several comparison-efficient string-matching and prefix-matching algorithms, including our new algorithm. We obtain roughly tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line self-prefix algorithms. Our algorithms can be implemented in linear time and space in the standard uniform-cost random-access-machine model.
∗BRICS – Basic Research in Computer Science, Centre of the Danish National Research Foundation, Department of Computer Science, University of Aarhus, DK-8000 Aarhus C, Denmark. Partially supported by the ESPRIT Basic Research Action Program of the EC under contract #7141 (ALCOM II). Part of the research reported in the paper was carried out while this author was visiting the Istituto di Elaborazione dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy, with the support of the European Research Consortium for Informatics and Mathematics postdoctoral fellowship.
†Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Belzoni 7, I-35131 Padova, Italy. Partially supported by “Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo” of the Italian National Research Council under grant number 89.00026.69.
‡Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Belzoni 7, I-35131 Padova, Italy. Parts of the research reported in this paper were carried out while this author was visiting the Institut Gaspard Monge, Université de Marne-la-Vallée, Noisy-le-Grand, France, supported by “Borsa di studi per attività di perfezionamento all’estero” from the University of Padua, and BRICS, Department of Computer Science, University of Aarhus, Aarhus, Denmark, supported by the Gini Foundation of Padua.
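As a concrete illustration of the problem statement only, the following minimal sketch computes, for each text position, the length of the longest pattern prefix that occurs starting there, using nothing but symbol equality tests. It is a naive quadratic-comparison baseline, not the algorithm of the paper, and the function names `prefix_match` and `self_prefix` are hypothetical.

```python
def prefix_match(pattern: str, text: str) -> list:
    """For each position i of `text`, return the length of the longest
    prefix of `pattern` that occurs in `text` starting at i.

    Naive baseline: up to len(pattern) equality tests per text position.
    This is NOT the comparison-efficient algorithm studied in the paper.
    """
    result = []
    for i in range(len(text)):
        k = 0
        # Extend the match one symbol at a time with equality tests only.
        while k < len(pattern) and i + k < len(text) and pattern[k] == text[i + k]:
            k += 1
        result.append(k)
    return result


def self_prefix(s: str) -> list:
    """Self-prefix problem: the pattern and the text are the same string."""
    return prefix_match(s, s)
```

Calling `prefix_match("aba", "ababa")` reports a full match of length 3 at positions 0 and 2. The bounds discussed in the abstract concern how far the number of symbol comparisons can be reduced below what this naive approach spends.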
Similar resources
Tight Comparison Bounds for the String Prefix-Matching Problem
In the string prefix-matching problem one is interested in finding the longest prefix of a pattern string of length m that occurs starting at each position of a text string of length n. This is a natural generalization of the string matching problem where only occurrences of the whole pattern are sought. The Knuth-Morris-Pratt string matching algorithm can be easily adapted to solve the string prefi...
Time Complexity of Knuth-Morris-Pratt String Matching Algorithm
This project centers on evaluating the time complexity of the Knuth-Morris-Pratt (KMP) string matching algorithm. The string matching problem is to locate a pattern string within a larger string. The best performance in terms of asymptotic time complexity is currently linear, given by the KMP algorithm. In this algorithm, first a prefix for the pattern string is computed and then based on this...
Crochemore's String Matching Algorithm: Simplification, Extensions, Applications
We address the problem of string matching in the special case where the pattern is very long. First, constant extra space algorithms are desirable with long patterns, and we describe a simplified version of Crochemore’s algorithm retaining its linear time complexity and constant extra space usage. Second, long patterns are unlikely to occur in the text at all. Thus we define a generalization of...
On the computational complexity of finding a minimal basis for the guess and determine attack
The guess-and-determine attack is one of the general attacks on stream ciphers. It is a common cryptanalysis tool for evaluating the security of stream ciphers. The effectiveness of this attack is based on the number of unknown bits that the attacker must guess to break the cryptosystem. In this work, we present a relation between the minimum number of guessed bits and uniquely restricted...
Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems
Ontology is the main infrastructure of the Semantic Web, providing facilities for the integration, searching and sharing of information on the web. The development of ontologies as the basis of the Semantic Web, together with their heterogeneities, has led to the emergence of ontology matching. With the emergence of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...
Journal title:
- J. Algorithms
Volume 29, issue
Pages -
Publication date: 1998